A Microblock Density-Based Similarity Measure for Graph Clustering
نویسندگان
چکیده
Graph clustering is an important technique in data mining and network analysis, and it is widely used in chemistry, physics, biology, communication, and computer science. Similarities between vertices of a graph are the fundamental conditions for many hierarchical clustering algorithms. In the paper, we propose a new similarity measure based on microblock density, which computes the similarity between a pair of vertices by calculating the densities of their common adjacent microblock. This measure extends the scope and improves the discrimination of traditional measure, thus significantly improving the performance and stability of the similarity-based clustering algorithms. Experiments on synthetic data and real networks show that the density-based similarity approach accurately reflects the local structure of the graph and provides higher accuracy similarities for clustering and community structural detection algorithms than other state-of-the-art methods.
منابع مشابه
خوشهبندی اسناد مبتنی بر آنتولوژی و رویکرد فازی
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
متن کاملA partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
متن کاملAnalysis of Resting-State fMRI Topological Graph Theory Properties in Methamphetamine Drug Users Applying Box-Counting Fractal Dimension
Introduction: Graph theoretical analysis of functional Magnetic Resonance Imaging (fMRI) data has provided new measures of mapping human brain in vivo. Of all methods to measure the functional connectivity between regions, Linear Correlation (LC) calculation of activity time series of the brain regions as a linear measure is considered the most ubiquitous one. The strength of the dependence obl...
متن کاملAn Empirical Comparison of Distance Measures for Multivariate Time Series Clustering
Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...
متن کاملFundamental Limitations of Spectral Clustering
Spectral clustering methods are common graph-based approaches to clustering of data. Spectral clustering algorithms typically start from local information encoded in a weighted graph on the data and cluster according to the global eigenvectors of the corresponding (normalized) similarity matrix. One contribution of this paper is to present fundamental limitations of this general local to global...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- JCP
دوره 10 شماره
صفحات -
تاریخ انتشار 2015